Due to their high activation sparsity and use of accumulates (AC) instead of expensive multiply-and-accumulates (MAC), neuromorphic spiking neural networks (SNNs) have emerged as a promising low-power alternative to traditional DNNs for several computer vision (CV) applications. However, most existing SNNs require multiple time steps for acceptable inference accuracy, which hinders real-time deployment and increases spiking activity and, consequently, energy consumption. Recent works have proposed direct encoding, which feeds the analog pixel values directly into the first layer of the SNN to significantly reduce the number of time steps. Although the overhead of the first-layer MACs with direct encoding is negligible for deep SNNs and SNN-based CV processing is efficient, the data transfer between the image sensors and the downstream processing costs significant bandwidth and may dominate the total energy. To mitigate this concern, we propose an in-sensor computing hardware-software co-design framework for SNNs targeting image recognition tasks. Our approach reduces the bandwidth between sensing and processing by 12-96x and the resulting total energy by 2.32x compared to traditional CV processing, with a 3.8% reduction in accuracy on ImageNet.
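The AC-vs-MAC saving can be sketched in a few lines: because spikes are binary, each synaptic operation reduces from a multiply-accumulate to a plain weight accumulate, and the operation count scales with activation sparsity. The per-operation energy figures below are illustrative assumptions (loosely inspired by published 45nm estimates), not numbers from the paper.

```python
# Toy comparison of a MAC-based dense layer vs. an AC-based spiking layer.
# E_MAC and E_AC are assumed per-operation energies (pJ), purely illustrative.

E_MAC, E_AC = 4.6, 0.9

def dense_macs(x, W):
    """Standard layer: one multiply-accumulate per input-weight pair."""
    out = [sum(xi * wij for xi, wij in zip(x, row)) for row in W]
    macs = len(x) * len(W)
    return out, macs

def spiking_acs(spikes, W):
    """Spiking layer: binary spikes reduce each op to a weight accumulate,
    and only active (spiking) inputs cost anything."""
    out = [sum(wij for s, wij in zip(spikes, row) if s) for row in W]
    acs = sum(spikes) * len(W)   # op count scales with activation sparsity
    return out, acs

x = [0.2, 0.0, 0.7, 0.1]
spikes = [1, 0, 1, 0]            # sparse binary activity
W = [[0.5, -0.3, 0.8, 0.1], [0.2, 0.4, -0.6, 0.9]]

_, macs = dense_macs(x, W)
_, acs = spiking_acs(spikes, W)
print(macs * E_MAC, acs * E_AC)  # dense energy vs. sparse spiking energy
```

With 50% input sparsity, the spiking layer here performs half as many operations, each at the cheaper AC cost.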
Generative models have been widely applied to solve extractive tasks, where parts of the input are extracted to form the desired output, and have achieved significant success. For example, in extractive question answering (QA), generative models have consistently yielded state-of-the-art results. In this work, we identify the issue of tokenization inconsistency that is commonly neglected in training these models. This issue damages the extractive nature of these tasks when the input and output are tokenized inconsistently by the tokenizer, and thus leads to performance drops as well as hallucination. We propose a simple yet effective fix for this issue and conduct a case study on extractive QA. We show that, with consistent tokenization, the model performs better on both in-domain and out-of-domain datasets, with a notable average gain of +1.7 F2 when a BART model is trained on SQuAD and evaluated on 8 QA datasets. Further, the model converges faster and becomes less likely to generate out-of-context answers. With these findings, we would like to call for more attention to how tokenization should be done when solving extractive tasks, and recommend applying consistent tokenization during training.
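The inconsistency can be seen with a toy greedy longest-match tokenizer. The vocabulary and strings below are invented for illustration; real subword tokenizers (e.g. the byte-level BPE used by BART) exhibit the same effect through leading spaces: a span tokenized in isolation can differ from the same span as it appears inside the tokenized input.

```python
# Toy greedy longest-match tokenizer illustrating tokenization inconsistency.
# VOCAB is invented for illustration only.

VOCAB = [" Paris", "Par", "is", "P", "a", "r", "i", "s", " "]

def tokenize(text):
    """Greedily match the longest vocabulary entry at each position."""
    tokens = []
    while text:
        match = max((t for t in VOCAB if text.startswith(t)), key=len)
        tokens.append(match)
        text = text[len(match):]
    return tokens

answer = "Paris"
# As the span appears inside the tokenized input (preceded by a space):
in_context = tokenize(" " + answer)    # one token: ' Paris'
# As a naively tokenized training target (no leading space):
standalone = tokenize(answer)          # different tokens: 'Par', 'is'

print(in_context, standalone)
```

Because the same surface span maps to two different token sequences, the model is supervised to emit target tokens that never occur in the tokenized input, which undermines the extractive nature of the task; the consistent fix is to tokenize the target exactly as it appears in context.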
This white paper lays out a vision of research and development in the field of artificial intelligence for the next decade (and beyond). Its denouement is a cyber-physical ecosystem of natural and synthetic sense-making, in which humans are integral participants$\unicode{x2014}$what we call ''shared intelligence''. This vision is premised on active inference, a formulation of adaptive behavior that can be read as a physics of intelligence, and which inherits from the physics of self-organization. In this context, we understand intelligence as the capacity to accumulate evidence for a generative model of one's sensed world$\unicode{x2014}$also known as self-evidencing. Formally, this corresponds to maximizing (Bayesian) model evidence, via belief updating over several scales: i.e., inference, learning, and model selection. Operationally, this self-evidencing can be realized via (variational) message passing or belief propagation on a factor graph. Crucially, active inference foregrounds an existential imperative of intelligent systems; namely, curiosity or the resolution of uncertainty. This same imperative underwrites belief sharing in ensembles of agents, in which certain aspects (i.e., factors) of each agent's generative world model provide a common ground or frame of reference. Active inference plays a foundational role in this ecology of belief sharing$\unicode{x2014}$leading to a formal account of collective intelligence that rests on shared narratives and goals. We also consider the kinds of communication protocols that must be developed to enable such an ecosystem of intelligences and motivate the development of a shared hyper-spatial modeling language and transaction protocol, as a first$\unicode{x2014}$and key$\unicode{x2014}$step towards such an ecology.
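The notion of "accumulating evidence for a generative model" can be made concrete with a minimal discrete example of Bayesian model comparison: two generative models of a coin are scored by the marginal likelihood (model evidence) they assign to observed data. The models, data, and numbers are invented for illustration and are not from the white paper.

```python
# Minimal sketch of self-evidencing as Bayesian model comparison.
# Two generative models of a coin compete via their model evidence.

def evidence(data, p_heads):
    """Likelihood of the data under a model that predicts heads with p_heads."""
    e = 1.0
    for obs in data:
        e *= p_heads if obs == "H" else (1 - p_heads)
    return e

data = ["H", "H", "T", "H"]
models = {"fair": 0.5, "biased": 0.75}   # assumed candidate models

# Posterior over models from equal priors, via Bayes' rule
ev = {m: evidence(data, p) for m, p in models.items()}
total = sum(ev.values())
posterior = {m: e / total for m, e in ev.items()}
print(posterior)  # the model that better explains the data gains belief
```

Maximizing model evidence at this scale is model selection; the same principle applied to parameters and states yields learning and inference, the other two scales mentioned above.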
Adaptation-relevant predictions of climate change are often derived by combining climate models in a multi-model ensemble. Model evaluation methods used in performance-based ensemble weighting schemes have limitations in the context of high-impact extreme events. We introduce a locally time-invariant model evaluation method with focus on assessing the simulation of extremes. We explore the behaviour of the proposed method in predicting extreme heat days in Nairobi.
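A performance-based weighting scheme of the kind being improved upon can be sketched as follows. The Gaussian weighting form and all numbers are illustrative assumptions for context; they are not the locally time-invariant, extremes-focused method the paper proposes.

```python
# Sketch of a performance-based multi-model ensemble weighting scheme:
# each model is weighted by exp(-(error/sigma)^2) based on its historical
# error, and predictions are combined as a weighted mean. Illustrative only.

import math

def ensemble_weights(errors, sigma=1.0):
    """Normalized Gaussian weights from each model's historical error."""
    raw = [math.exp(-(e / sigma) ** 2) for e in errors]
    s = sum(raw)
    return [w / s for w in raw]

def weighted_prediction(preds, weights):
    """Weighted ensemble mean of the models' predictions."""
    return sum(p * w for p, w in zip(preds, weights))

hist_errors = [0.2, 1.5, 0.6]   # assumed model errors vs. observations
preds = [34.1, 31.0, 33.4]      # assumed predicted values (e.g. deg C)

w = ensemble_weights(hist_errors)
print(weighted_prediction(preds, w))  # pulled toward the better models
```

The limitation the abstract points to is visible here: weights derived from average historical performance need not reflect how well a model simulates rare extremes, which motivates an evaluation method focused on extremes.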
Both industry and academia have made considerable progress in developing trustworthy and responsible machine learning (ML) systems. While critical concepts like fairness and explainability are often addressed, the safety of systems is typically not sufficiently taken into account. By viewing data-driven decision systems as socio-technical systems, we draw on the uncertainty in ML literature to show how fairML systems can also be safeML systems. We posit that a fair model needs to be an uncertainty-aware model, e.g. by drawing on distributional regression. For fair decisions, we argue that a safe fail option should be used for individuals with uncertain categorization. We introduce semi-structured deep distributional regression as a modeling framework which addresses multiple concerns brought against standard ML models and show its use in a real-world example of algorithmic profiling of job seekers.
The mutual incompatibility of distinct optical spectroscopy systems is one of the most limiting factors in laser-induced breakdown spectroscopy (LIBS). The costs associated with setting up a new LIBS system are increased by the need for extensive calibration. Solving this problem would enable inter-laboratory reference measurements and shared spectral libraries, which are crucial for other spectroscopic techniques. In this work, we study a simplified version of this challenge, in which the LIBS systems differ only in the spectrometers and collection optics used, but share all other parts of the apparatus and collect spectra simultaneously from the same plasma plume. An extensive dataset of hyperspectral-image measurements of a heterogeneous specimen is used to train machine learning models that can transfer spectra between the systems. The transfer is realized by a pipeline consisting of a variational autoencoder (VAE) and a fully connected artificial neural network (ANN). In the first step, we obtain a latent representation of the spectra measured on the primary system (by using the VAE). In the second step, we map spectra from the secondary system to the corresponding positions in the latent space (with the ANN). Finally, the secondary-system spectra are reconstructed from the latent space into the space of the primary system. The transfer is evaluated by several figures of merit (Euclidean and cosine distances, both spatially resolved; k-means clustering of the transferred spectra). The methodology is compared against several baseline approaches.
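The three-step structure of such a pipeline (encode on the primary system, map the secondary system into the latent space, decode back to the primary system's space) can be sketched with the learned networks replaced by fixed toy functions. Everything below, including the assumed gain difference between systems, is invented to show the data flow; a real pipeline would learn the VAE and the mapping ANN from paired spectra.

```python
# Toy sketch of the spectra-transfer pipeline: encode -> map -> decode.
# Fixed linear stand-ins replace the learned VAE and mapping ANN.

def encode(spectrum):
    """Stand-in for the VAE encoder: a 2-d 'latent' of mean and spread."""
    return [sum(spectrum) / len(spectrum), max(spectrum) - min(spectrum)]

def decode(z):
    """Stand-in for the VAE decoder: rebuild a spectrum in the primary space."""
    mean, spread = z
    return [mean - spread / 2, mean, mean + spread / 2]

def map_to_latent(secondary):
    """Stand-in for the ANN mapping secondary-system spectra to the latent
    space; the 2x gain is an assumed systematic difference between systems."""
    calibrated = [2.0 * v for v in secondary]
    return encode(calibrated)

secondary_spectrum = [0.1, 0.4, 0.25]
transferred = decode(map_to_latent(secondary_spectrum))
print(transferred)  # secondary measurement expressed in the primary space
```

Evaluation would then compare `transferred` against the spectrum the primary system actually recorded from the same plasma plume, e.g. via Euclidean or cosine distance.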
Can we really "read the mind in the eyes"? Moreover, can AI assist us in this task? This paper answers both questions by introducing a machine learning system that predicts an individual's personality traits from their face. It does so by tracking the emotional responses of the individual's face while they watch a series of 15 short videos of different genres. To calibrate the system, we invited 85 people to watch the videos while their emotional responses were analyzed through their facial expressions. At the same time, these individuals also completed four validated surveys of personality traits and moral values: the revised NEO-FFI personality inventory, the Haidt moral foundations test, the Schwartz personal value system, and the domain-specific risk-taking scale (DOSPERT). We found that an individual's personality traits and moral values can be predicted from their emotional responses to the videos as shown on their face, with an accuracy of up to 86% using gradient-boosted trees. We also found that different videos are better at predicting different personality traits; in other words, no single video provides accurate predictions for all personality traits, but it is the responses to a mix of different videos that allow for accurate prediction.
Automatically localizing vulnerable statements in source code is crucial for ensuring software security and easing developers' debugging effort. This becomes even more important in today's software ecosystem, where vulnerable code can flow easily and unwittingly within and across software repositories like GitHub. Across such millions of lines of code, traditional static and dynamic approaches struggle to scale. Although machine-learning-based approaches look promising in such a setting, most work detects vulnerable code at a higher granularity, at the method or file level. Thus, developers still need to inspect a significant amount of code to locate the vulnerable statements that need fixing. This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements. Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph and effectively understand code semantics and vulnerable patterns. In the static-analysis setting, where vulnerable functions are not detected in advance, VELVET achieves 4.5x better performance than the baseline static analyzers on real-world data. For the isolated vulnerability-localization task, where we assume the vulnerability of a function is known while the specific vulnerable statement is unknown, we compare VELVET with several neural networks that also attend to local and global code context; VELVET achieves 99.6% and 43.6% top-1 accuracy on an off-the-shelf synthetic dataset and a recently published real-world dataset, respectively, outperforming the baseline deep learning models by 5.3-29.0%.
Large Transformer models yield impressive results on many tasks, but are expensive to train, or even fine-tune, and so slow at decoding that their use and study become out of reach. We address this problem by leveraging sparsity. We study sparse variants for all layers in the Transformer and propose Scaling Transformers, a family of next-generation Transformer models that use sparse layers to scale efficiently and perform unbatched decoding much faster than the standard Transformer as we scale up the model size. Surprisingly, the sparse layers are enough to obtain the same perplexity as the standard Transformer with the same number of parameters. We also integrate with prior sparsity approaches to attain fast inference on long sequences even with limited memory. This results in performance competitive with the state of the art on long-text summarization.
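One way to picture a sparse feed-forward layer is a top-k gate over the hidden activations: only k of the d_ff units contribute, so only k rows of the second weight matrix are touched at decode time. This sketch uses exact top-k for clarity; the actual Scaling Transformers design uses a learned low-rank controller to choose active units, and all dimensions and weights below are invented.

```python
# Sketch of a sparsely activated feed-forward block: only the top-k hidden
# activations (after ReLU) are kept, so decoding touches k rows of W2.

def sparse_ffn(x, W1, W2, k):
    # First projection + ReLU; W1 holds d_ff columns of length d_model.
    h = [max(0.0, sum(xi * w for xi, w in zip(x, col))) for col in W1]
    # Sparsity: keep only the k largest activations.
    kept = sorted(range(len(h)), key=lambda i: h[i], reverse=True)[:k]
    # Second projection touches only the kept rows of W2.
    out = [0.0] * len(W2[0])
    for i in kept:
        for j, w in enumerate(W2[i]):
            out[j] += h[i] * w
    return out, len(kept)

x = [1.0, 0.5]                                            # d_model = 2
W1 = [[0.3, 0.1], [-0.2, 0.4], [0.5, 0.5], [0.1, -0.3]]   # d_model -> d_ff=4
W2 = [[0.2, 0.1], [0.3, -0.1], [-0.4, 0.2], [0.1, 0.1]]   # d_ff -> d_model

out, active = sparse_ffn(x, W1, W2, k=2)
print(out, active)  # only 2 of the 4 hidden units contribute
```

At large d_ff this is where the decoding speedup comes from: the cost of the second projection scales with k rather than d_ff.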
Online advertising revenues account for an increasing share of publishers' revenue streams, especially for small and medium-sized publishers who depend on the advertising networks of tech companies such as Google and Facebook. Thus, publishers may benefit significantly from accurate online advertising revenue forecasts to better manage their website monetization strategies. However, publishers who only have access to their own revenue data lack a holistic view of the total publisher advertising market, which in turn limits their ability to generate insights into their own future online advertising revenues. To address this business problem, we leverage a proprietary database encompassing Google AdSense revenues from a large collection of publishers in diverse areas. We adopt the Temporal Fusion Transformer (TFT) model, a novel attention-based architecture, to predict publishers' advertising revenues. We leverage multiple covariates, including not only the publisher's own characteristics but also other publishers' advertising revenues. Our prediction results outperform several benchmark deep-learning time-series forecasting models over multiple time horizons. Moreover, we interpret the results by analyzing variable importance weights to identify significant features, and self-attention weights to reveal persistent temporal patterns.